We propose a probabilistic approach to jointly modeling participants' reliability and humans' regularity in crowdsourced affective studies. Reliability measures how likely a subject is to respond to a question seriously; regularity measures how often a human agrees with other seriously-entered responses coming from a targeted population. Crowdsourcing-based studies or experiments that rely on self-reported human affect pose additional challenges compared with typical crowdsourcing studies that acquire concrete, non-affective labels of objects. The reliability of participants has been extensively studied for typical non-affective crowdsourcing tasks, whereas the regularity of humans in an affective experiment has not, in its own right, been thoroughly considered. It is often observed that different individuals report different feelings on the same test question, which has no single correct response in the first place. High reliability of responses from one individual thus cannot conclusively imply high consensus across individuals. Instead, investigators are interested in globally testing the consensus of a population. Built upon the agreement multigraph among tasks and workers, our probabilistic model differentiates subject regularity from population reliability. We demonstrate the method's effectiveness for in-depth, robust analysis of large-scale crowdsourced affective data, including emotion and aesthetic assessments collected by presenting visual stimuli to human subjects.
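The full probabilistic model is not specified in this abstract. As a toy illustration only, the following sketch shows what an agreement multigraph over tasks and workers might look like, with a crude per-worker agreement rate as a stand-in for reliability; all data, worker names, and the agreement-rate proxy are hypothetical, not the paper's method:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical responses: task -> {worker: discrete affect label}.
responses = {
    "img1": {"w1": "joy", "w2": "joy", "w3": "sad"},
    "img2": {"w1": "calm", "w2": "calm", "w3": "calm"},
    "img3": {"w1": "joy", "w2": "sad", "w3": "sad"},
}

# Agreement multigraph: one edge per task on which a worker pair agrees,
# stored as multiplicity counts keyed by the (sorted) worker pair.
agree_edges = defaultdict(int)   # (a, b) -> number of agreeing tasks
shared_tasks = defaultdict(int)  # (a, b) -> number of tasks both answered
for task, labels in responses.items():
    for a, b in combinations(sorted(labels), 2):
        shared_tasks[(a, b)] += 1
        if labels[a] == labels[b]:
            agree_edges[(a, b)] += 1

# Crude reliability proxy: a worker's average agreement rate with peers.
workers = {w for labels in responses.values() for w in labels}
reliability = {}
for w in workers:
    agree = sum(c for pair, c in agree_edges.items() if w in pair)
    total = sum(c for pair, c in shared_tasks.items() if w in pair)
    reliability[w] = agree / total if total else 0.0
```

In this toy data, w2 agrees with peers most often, so the proxy ranks w2 above w1 and w3; the paper's model goes further by separating such per-worker reliability from population-level regularity, which a simple agreement rate cannot do.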